Spatiotemporal characteristics of cortical activities of REM sleep behavior disorder revealed by explainable machine learning using 3D convolutional neural network

Isolated rapid eye movement sleep behavior disorder (iRBD) is a sleep disorder characterized by dream enactment behavior without any neurological disease and is frequently accompanied by cognitive dysfunction. The purpose of this study was to reveal the spatiotemporal characteristics of abnormal cortical activities underlying cognitive dysfunction in patients with iRBD based on an explainable machine learning approach. A convolutional neural network (CNN) was trained to discriminate the cortical activities of patients with iRBD and normal controls based on three-dimensional input data representing spatiotemporal cortical activities during an attention task. The input nodes critical for classification were determined to reveal the spatiotemporal characteristics of the cortical activities that were most relevant to cognitive impairment in iRBD. The trained classifiers showed high classification accuracy, while the identified critical input nodes were in line with preliminary knowledge of cortical dysfunction associated with iRBD in terms of both spatial location and temporal epoch for relevant cortical information processing for visuospatial attention tasks.

recognition 18 . Our data, multichannel EEGs, can be converted to current density time series on cortical surfaces using source localization techniques 19 , which are essentially 3d spatiotemporal data.
In a recent study, we identified the spatial characteristics of dysfunctional cortical activities of patients with neurological disorders 20,21 based on a 2dCNN trained by 2d data representing current densities on the cortical surface within a critical temporal period, which is supposed to be crucial for working memory 22 . The temporal period was determined based on prior knowledge of the cognitive function under consideration, which may be misleading and has resulted in limitations in the objective identification of crucial characteristics solely based on a data-driven approach.
Here, we tried to discriminate cortical activities of iRBD patients from normal controls during cognitive function using 3dCNN, and to localize critical spatial location and temporal epoch, which reflects dysfunctional cortical activities associated with iRBD, by applying an explainable machine learning approach, that is, by identifying the input nodes of the CNN that play critical roles in the decision of the output. It is expected that the proposed method will contribute to elucidating the neural mechanism of abnormal brain activity in patients with iRBD, which cannot be revealed by conventional statistical analysis. Compared with our previous approach using 2dCNN, the 3dCNN-based method proposed here relies entirely on the data, without an a priori assumption on the critical temporal epoch.

Methods
Subjects and clinical screenings. A detailed description of the experimental procedures is presented in our previous paper 23 , and is briefly summarized here. Drug-naïve iRBD patients who visited Seoul National University Hospital were enrolled in this study. Normal controls without any sleep-related symptoms or neuropsychological diseases were screened via a survey and clinical interview. Experimental data were collected from 49 iRBD patients (aged 65.96 ± 5.94, 29 males) and 49 normal controls (aged 66 ± 6.37, 33 males). All experimental procedures performed in this study were approved by the Seoul National University Hospital Institutional Review Board (IRB Number 1406-100-589). All experiments were performed in accordance with relevant guidelines and regulations. Informed consent was obtained from all the subjects.
The subjects underwent neurological and cognitive tests before the main experiment. RBD symptom severity was evaluated using the Korean version of the RBD screening questionnaire (RBDQ-HK) 24 . Autonomic dysfunction was assessed using the Scales for Outcomes in Parkinson's Disease for autonomic symptoms (SCOPA-AUT) 25 . Sleep quality was assessed by using the Pittsburgh Sleep Quality Index (PSQI) 26 . Excessive daytime sleepiness was assessed using the Epworth Sleepiness Scale (ESS) 27 . Global cognitive function was evaluated using the Korean version of the Montreal Cognitive Assessment (MoCA) 28 and Mini-Mental State Examination (MMSE) 29 .
The subject demographics and cognitive test results are presented in Table 1. No significant difference between iRBD patients and normal controls was found in demographics, except for education. Patients showed significantly higher SCOPA-AUT and PSQI scores. The neuropsychological test results (Table 2) revealed that MMSE, MoCA total, attention, abstraction, memory recall, and orientation scores were significantly lower in iRBD patients than in normal controls.
Subjects performed Posner's cueing task while multichannel EEG signals were being recorded 30 . In each trial, a cue stimulus was presented on the left or right side of the central fixation point, and then a target stimulus was presented in the same (valid) or opposite (invalid) position. The time interval between the cue and the target stimulus was either 200 ms (SOA 200 condition) or 1000 ms (SOA 1000 condition). Subjects were asked to press a button as soon as possible in response to the target stimuli. Five hundred trials were presented to the subjects.
EEG acquisition and preprocessing. Sixty-channel EEGs with a sampling frequency of 400 Hz were recorded based on the 10-10 system. Two electrooculogram channels were placed on the left and right outer canthi to remove eye-related artifacts. Reference and ground electrodes were placed on the ear and AFz sites, respectively. The electrode impedances were maintained below 10 kΩ. The acquired EEG signals were band-pass filtered for a frequency range of 0.1-70 Hz along with a 60 Hz notch filter. The recorded signals were re-referenced to the average of all the electrodes. Single-trial EEG epochs that were heavily contaminated by signal drift, high amplitudes above 100 μV, or non-stationary noise with high-frequency fluctuations were removed by visual inspection. Stationary noise, such as eye and muscle artifacts, was corrected using independent component analysis 31 .
Data analysis. The overall procedure for the data analysis is presented in Fig. 1. The preprocessed EEG signals were segmented into single-trial waveforms based on the target stimulus onset (−1200 to 800 ms). Multichannel EEGs were transformed to cortical current density time series by weighted minimum norm estimation (wMNE) 32 , which yielded 3d input data for the 3dCNN classifier. After successful training, critical input nodes were identified to reveal the spatial and temporal characteristics of cortical activity that most strongly reflected the difference between patients with iRBD and normal controls. A similar procedure was performed using the 2dCNN for comparison. In this case, the critical temporal period was predetermined to be 200-350 ms, which is known to be important for visuospatial attention 33 . The cortical current density time series were converted into 2d images by averaging within this critical period.
Preparation of input data for CNN classifier. 3d input data were constructed by concatenating 2d images of the cortical current densities over multiple temporal points, as shown in Fig. 1A. After segmentation into −1200 to 800 ms intervals, further lowpass filtering (< 30 Hz) and baseline correction were performed by subtracting the average amplitude between −200 and 0 ms. EEG recordings were converted to current density time series over 0-800 ms on 15,002 equally distributed points on cortical surfaces using the Brainstorm toolbox 34 . For the forward problem, a volume conduction model was constructed from the ICBM 152 anatomical template using a boundary element method 35 . Weighted minimum norm estimation was applied to estimate the current source density distribution, as explained by Tadel et al. 36 . The 15,002 points on the cortical surface were first projected onto a sphere with registered coordinates in Brainstorm, and then the surface of the sphere was further projected onto a 2D plane using the Mollweide projection (Fig. 1B) 37,38 . For each time point, a 2D image of the cortical current sources was generated by interpolating the values at the 15,002 points onto an equally spaced 120 × 120 grid. The pixel intensities of the 2D images were converted to z-scores via standardization. Then, the current densities within 50 ms epochs were averaged, resulting in 16 2d images during 0-800 ms. Thus, the dimension of the 3d input to the CNN was 120 × 120 × 16. In total, 47,513 3d data samples were generated. Of these, 23,553 were from 49 normal controls, and 23,960 were from 49 patients with iRBD. For the 2dCNN, the dimension of the data was 120 × 120 × 1, since a 2d image of cortical current density was obtained by averaging within 200-350 ms. This temporal epoch is known to be critical for visuospatial attentional processing during the Posner task, corresponding to the N1 and P300 event-related potential (ERP) components 23,33,39 .
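The temporal binning and standardization described above can be sketched for a single pixel as follows. This is a minimal illustration, assuming the 400 Hz sampling rate reported for the EEG recordings also applies to the source time series, so that each 50 ms epoch spans 20 samples. The function names `average_epochs` and `zscore` and the toy series are hypothetical and are not taken from the authors' code.

```python
from statistics import mean, pstdev


def average_epochs(series, fs_hz=400, epoch_ms=50):
    """Average a 1-D time series within consecutive, non-overlapping epochs."""
    samples_per_epoch = fs_hz * epoch_ms // 1000  # 20 samples at 400 Hz
    n_epochs = len(series) // samples_per_epoch
    return [
        mean(series[e * samples_per_epoch:(e + 1) * samples_per_epoch])
        for e in range(n_epochs)
    ]


def zscore(values):
    """Standardize values to zero mean and unit (population) variance."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical current density at one pixel over 0-800 ms (320 samples).
pixel_series = [float(t % 40) for t in range(320)]
frames = average_epochs(pixel_series)  # one value per 50 ms epoch
```

Applying the same binning to every pixel of the 120 × 120 grid gives the 16 frames that make up one 3d input sample.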
The structure of CNN classifier. The structure of the CNN classifier was devised based on the C3D model, which has been shown to be effective in learning spatiotemporal features from 3d video data 40,41 . The convolution module in the CNN consists of three repetitions of a convolutional layer, batch normalization layer, and max pooling layer, followed by two fully connected layers and one output layer that performs classification, as shown in Fig. 2A. The numbers of filters in the three convolution modules were 64, 128, and 256. The structures of the 2dCNN and 3dCNN classifiers were identical, except for the type of convolutional layer, kernel size, and stride size. The kernel sizes of the convolutional layers of 3dCNN were 3 × 3 × 3, with stride sizes of 1. The max pooling layers had kernel and stride sizes of 2 × 2 × 2, except for the first layer. The kernel/stride size of the first max pooling layer was 2 × 2 × 1.
For the 2dCNN, all convolutional layers had a kernel size of 3 × 3, with a stride size of 1. The pooling layers were max pooling layers with kernel and stride sizes of 2 × 2. The fully connected layer had 512 units. The activation functions for all nodes were rectified linear units (ReLU), except for the output layer nodes, for which the sigmoid activation function was adopted.
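The layer sizes given above for the 3dCNN imply a sequence of feature-map shapes that can be traced with a short calculation. The sketch below assumes that the convolutions use "same" padding, so that only the pooling layers change the spatial and temporal dimensions; the padding scheme is not stated in the text and is an assumption, as are the helper names `conv_same` and `max_pool`.

```python
def conv_same(shape):
    """3x3x3 convolution, stride 1, 'same' padding (assumed): shape unchanged."""
    return shape


def max_pool(shape, kernel):
    """Max pooling with stride equal to the kernel size (floor division)."""
    return tuple(s // k for s, k in zip(shape, kernel))


shape = (120, 120, 16)                            # height x width x time
pool_kernels = [(2, 2, 1), (2, 2, 2), (2, 2, 2)]  # first pool keeps the time axis
n_filters = [64, 128, 256]

trace = []
for filters, kernel in zip(n_filters, pool_kernels):
    shape = max_pool(conv_same(shape), kernel)
    trace.append((filters, shape))

# Number of features passed to the first fully connected layer
flattened = n_filters[-1] * shape[0] * shape[1] * shape[2]
```

Under these assumptions, the first pooling layer halves only the spatial dimensions, so all 16 temporal frames are still resolved after the first convolution module.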
In addition, we performed an analysis of performance changes according to the depth of the network, including learning accuracy and robustness. Three structures were tested: shallow, standard, and deep (Fig. S1).
Training and test of the classifier. The training and evaluation of the CNN classifier consisted of two stages: pretraining and fine-tuning/evaluation, as shown in Fig. 3. First, the training data were prepared by eliminating all data from a single specific patient (SP) for pretraining. After successful pretraining, a transfer learning approach was applied to the SP, and the classification accuracy was evaluated. This procedure was repeated for all the 49 patients with iRBD. Training and testing of the CNN were performed using an AMD Ryzen  Fig. 3B). Random undersampling was applied to avoid the class imbalance problem so that the ratio of data points in the two classes (iRBD patients and normal controls) was 1:1 43 . The data were further divided randomly into training (90%) and validation (10%) sets. The weights and biases were initialized using the Kaiming method 44 . Owing to limited memory allocation capacity, the mini-batch size was set to 128. The binary cross-entropy loss function was adopted and minimized using the Adam optimizer 45 . The optimal learning rate was determined to be within the range of 1 × 10⁻⁸ to 1 using the learning rate range test proposed by Smith 46 . The weight decay was set as 1 × 10⁻⁵. The classifier was trained for 100 epochs and early stopping was applied if the validation accuracy did not improve after 10 epochs.
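The class balancing and the 90%/10% split described above can be illustrated with the standard library alone. This is only a sketch: it uses the total sample counts reported earlier (23,553 control and 23,960 patient samples) for illustration, whereas the actual pretraining sets exclude the data of one specific patient, and the function names and the fixed seed are hypothetical.

```python
import random


def undersample(class_a, class_b, rng):
    """Randomly drop samples from the larger class to reach a 1:1 ratio."""
    n = min(len(class_a), len(class_b))
    return rng.sample(class_a, n), rng.sample(class_b, n)


def train_val_split(samples, val_fraction, rng):
    """Shuffle and split samples into training and validation subsets."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


rng = random.Random(0)  # fixed seed, for reproducibility of the sketch only
controls = [("control", i) for i in range(23553)]
patients = [("iRBD", i) for i in range(23960)]

controls_bal, patients_bal = undersample(controls, patients, rng)
train, val = train_val_split(controls_bal + patients_bal, 0.10, rng)
```

Undersampling discards a small number of patient samples here; the alternative of oversampling the smaller class would duplicate trials instead.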
Fine-tuning and evaluation stage. For the fine-tuning of each SP, the input data for the training were constructed from the data of the SP and randomly selected data from normal controls that were not used for the pretraining, as shown in Fig. 3A. Data from healthy controls were included to avoid overfitting to a single class (iRBD patient class). Of this dataset, 80% was used for training and 20% for evaluation. During the training for the fine-tuning, only the parameters of the fully connected layers and output layer were adjusted, whereas those of the convolution layers were fixed to those obtained from the pretraining stage (the lower part of Fig. 3B). The convolution layers of a pretrained model are known for their ability to extract useful features from images that can be used for various image classification tasks 47,48 . Therefore, the convolutional layers were frozen to retain the pre-learned features, and only the fully connected layers were allowed to learn task-specific features from unseen patient data. However, if the new data differ significantly from the data used in the pretrained model, or if the fully connected layers do not learn task-specific features effectively, the convolutional layers can be fine-tuned to fit the new data. The learning rate and weight decay were set to 1/10 of the pretraining values to prevent overfitting 49,50 . All other parameters were set to be equal to those for pretraining.

Determination of critical input features.
Spatiotemporal characteristics of neural activity reflecting distinct differences between iRBD patients and normal controls were identified by finding the nodes in the input layer that contribute considerably to the decision of the CNN classifiers, i.e., by 'explaining' the CNN. Two representative methods for 'explainable machine learning', layer-wise relevance propagation (LRP) and guided gradient-weighted class activation mapping (GGCAM), were adopted here 51,52 (Fig. 2B). LRP is a method for computing the relevance scores of the nodes in the input layer by repeated backpropagation, which decomposes a single node's output into the contributions of the nodes in the previous layer. Backpropagation for the relevance scores is performed as shown below in Eq. (1).

$$R_j^{l} = \sum_k \frac{z_{jk}}{\sum_{j'} z_{j'k}} R_k^{l+1} \tag{1}$$

where $l$ denotes the layer index, $R_k^{l+1}$ indicates the relevance of node $k$ in the higher layer, and $R_j^{l}$ indicates the relevance of node $j$ in the lower layer. $z_{jk}$ denotes the extent to which neuron $j$ of the lower layer contributes to neuron $k$ of the higher layer.
Several improvements in the propagation rule of Eq. (1) have been presented 53 . We applied the LRP-0 and LRP-gamma rules for the fully connected and convolutional layers, respectively, as proposed by Montavon et al. 51 . The source code for LRP is available at http://heatmapping.org. The set of relevance scores for the input nodes provided a heatmap representing the contribution of each cortical point to the classifier output.
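The basic relevance redistribution of LRP can be illustrated on a single fully connected layer. The sketch below implements the plain LRP-0 rule without bias terms or a stabilizing term, assuming that the summed contributions to each output node are nonzero. It is not the authors' implementation, and the function name `lrp0_dense` and all numerical values are hypothetical.

```python
def lrp0_dense(activations, weights, relevance_out):
    """Redistribute relevance from a dense layer's outputs to its inputs.

    activations:   a_j, inputs to the layer (length J)
    weights:       w[j][k], weight from input j to output k (J x K)
    relevance_out: R_k at the layer output (length K)
    """
    n_in, n_out = len(activations), len(relevance_out)
    # z_jk = a_j * w_jk is the contribution of input j to output k
    z = [[activations[j] * weights[j][k] for k in range(n_out)]
         for j in range(n_in)]
    # Total contribution received by each output node k
    z_sum = [sum(z[j][k] for j in range(n_in)) for k in range(n_out)]
    return [
        sum(z[j][k] / z_sum[k] * relevance_out[k] for k in range(n_out))
        for j in range(n_in)
    ]


a = [1.0, 2.0, 0.5]        # hypothetical lower-layer activations
w = [[0.5, 1.0],
     [0.25, 0.5],
     [1.0, 1.0]]           # hypothetical weights
r_out = [1.5, 2.5]         # hypothetical higher-layer relevance
r_in = lrp0_dense(a, w, r_out)
```

The total relevance is conserved from one layer to the next, which is why the scores at the input layer can be read as a decomposition of the classifier output over cortical points and time frames.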
Gradient-weighted class activation mapping (Grad-CAM) is a method used to find the nodes that contribute greatly to the output based on the gradient of the output with respect to their activation 52 . For 3dCNN, the importance score of a node $ijk$, $L_{ijk}$, is calculated from the product of its activation $A^n_{ijk}$ and the average of the class score gradient over the feature map to which node $ijk$ belongs (denoted by $n$), as follows:

$$L_{ijk} = \mathrm{ReLU}\left(\sum_n \alpha_n A^n_{ijk}\right) \tag{2}$$

where

$$\alpha_n = \frac{1}{Z} \sum_i \sum_j \sum_k \frac{\partial y}{\partial A^n_{ijk}} \tag{3}$$

Here, $y$ denotes the output from the output layer, which corresponds to the class score, $Z$ is the number of nodes in feature map $n$, and $i$, $j$, and $k$ represent the indices for the location of a node in a 3d feature map. For the 2dCNN classifier, the score of node $ij$ is calculated in the same manner.
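A small numerical sketch may help to make the Grad-CAM computation concrete. It follows the standard formulation of Selvaraju et al.: the gradient is averaged over each feature map, the activations are summed with these averages as weights, and negative values are removed by a ReLU. Node positions are flattened to a single index for brevity, and the function name `grad_cam` and all values are hypothetical.

```python
def grad_cam(activations, gradients):
    """Compute Grad-CAM scores from per-feature-map activations and gradients.

    activations[n][p]: activation of node p in feature map n
    gradients[n][p]:   d(class score)/d(activation) for the same node
    """
    n_maps, n_nodes = len(activations), len(activations[0])
    # alpha_n: gradient averaged over all nodes of feature map n
    alphas = [sum(g) / len(g) for g in gradients]
    # Weighted sum over feature maps, followed by ReLU
    return [
        max(0.0, sum(alphas[n] * activations[n][p] for n in range(n_maps)))
        for p in range(n_nodes)
    ]


acts = [[1.0, 2.0, 0.0, 1.0],      # hypothetical feature map 0
        [0.0, 1.0, 3.0, 1.0]]      # hypothetical feature map 1
grads = [[0.5, 0.5, 0.5, 0.5],     # mean gradient = 0.5
         [-1.0, 0.0, -1.0, 0.0]]   # mean gradient = -0.5
heatmap = grad_cam(acts, grads)
```

In this toy case, nodes dominated by the feature map with a negative average gradient receive a score of zero, so only nodes that support the class score remain in the heatmap.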
Guided GradCAM (GGCAM) was proposed by Selvaraju et al. 52 to alleviate the problem of low resolution of GradCAM, which obtains the heatmap at the middle layer. A method called guided backpropagation (GBP) is applied here to achieve the resolution of the input layer after upsampling the heatmap obtained by GradCAM (Eqs. (2) and (3)) to the size of the input layer. GBP refers to an algorithm that calculates the gradient of the class score with respect to the input in the same way as a typical BP algorithm, except for backpropagation at the ReLU nodes 54 . BP and GBP can be described by Eqs. (4) and (5), respectively, as follows:

$$g_i^{l} = \mathbb{1}\left[A_i^{l} > 0\right] g_i^{l+1} \tag{4}$$

$$g_i^{l} = \mathbb{1}\left[A_i^{l} > 0\right] \mathbb{1}\left[g_i^{l+1} > 0\right] g_i^{l+1} \tag{5}$$

Here, $l$ and $g_i^{l+1}$ denote the layer number and the gradient of node $i$ in the higher layer $l+1$, $A_i^{l}$ is the activation of node $i$ in the lower layer $l$, and $\mathbb{1}[\cdot]$ is the indicator function, which equals 1 when its condition holds and 0 otherwise.
As shown in Eqs. (4) and (5), the gradient is not propagated to the lower layer if either the activation of the lower layer or the gradient of the higher layer is negative for the GBP, whereas it is not backpropagated only when the activation is negative for a normal BP. The GBP is repeated up to the input layer, and then the GGCAM heatmap is obtained at the resolution of the input layer by multiplying the GradCAM heatmap and the gradient obtained by Eq. (5).
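The difference between the two backpropagation rules at a ReLU node, as described above, can be checked on the four sign combinations of activation and incoming gradient. This is a minimal sketch with hypothetical function names and values.

```python
def relu_backprop(activation, grad_out):
    """Standard backpropagation through a ReLU node."""
    return grad_out if activation > 0 else 0.0


def guided_relu_backprop(activation, grad_out):
    """Guided backpropagation: also block negative incoming gradients."""
    return grad_out if (activation > 0 and grad_out > 0) else 0.0


# (activation in the lower layer, gradient from the higher layer)
cases = [(1.0, 2.0), (1.0, -2.0), (-1.0, 2.0), (-1.0, -2.0)]
bp = [relu_backprop(a, g) for a, g in cases]
gbp = [guided_relu_backprop(a, g) for a, g in cases]
```

The two rules differ only in the second case, where the activation is positive but the incoming gradient is negative: standard BP passes the negative gradient on, whereas GBP suppresses it.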

Statistical analysis.
In this study, we calculated Pearson's correlation coefficients to examine the association between the cortical current density averaged over critical spatiotemporal regions and clinical/cognitive function scores 55 . A one-tailed test was performed to evaluate the strength of this relationship. Based on the subject demographics and cognitive test results, we hypothesized that critical spatiotemporal features would exhibit a negative correlation with clinical scores and a positive correlation with cognitive function scores.

Results
Figure 4 presents the classification accuracies for the evaluation data from all SPs. The left panel of Fig. 4A shows a confusion matrix that summarizes the classification results of the 2dCNN classifier. The true positive rate for iRBD patients was 96.06% and the true negative rate for the normal controls was 95.62%. For the 3dCNN classifier, the training accuracy was 100 ± 0.00% for both the pretraining and fine-tuning stages. The evaluation on the test data showed the mean accuracy of 99.81 ± 0.32% (precision 99.77 ± 0.47%, recall 99.85 ± 0.47%, and AUROC 99.49 ± 0.01%). The left panel of Fig. 4B shows a confusion matrix summarizing the classification results for the 3dCNN classifier. The true positive and true negative rates were 99.77% and 99.85%, respectively, which demonstrated lower errors compared with the 2dCNN classifier. A statistical comparison of the 2d and 3dCNN classifiers showed that the classification performance of 3dCNN was significantly higher than that of 2dCNN (t(48) = 11.50, p < 0.001).
The classification performance was not significantly different among the structures, except that the training accuracy of the shallow structure increased slowly with respect to the number of iterations (Fig. S2).

Critical spatiotemporal features of cortical activity. The heatmaps in Fig. 5 present the distribution of relevance scores on the cortical surface (rear view) at 50 ms time intervals obtained by averaging the correctly classified test data from iRBD patients. The spatiotemporal distributions obtained by the two methods, LRP and GGCAM, were similar, that is, high scores were focused on similar spatiotemporal regions. Overall, the heatmaps obtained by 2dCNN were also close to those obtained by 3dCNN when the temporal epoch was carefully predetermined to 200-350 ms (Fig. 5C).
The critical cortical region revealed by 3dCNN + LRP was located around the right lateral occipital region (LO) at 200-500 ms, while 3dCNN + GGCAM yielded the bilateral occipital region at 100-400 ms, and the right superior parietal lobule (SPL) at 300-400 ms. The right LO was consistently identified in both LRP and GGCAM (Fig. 5A). Figure 5B shows the change in relevance scores with respect to time for the three critical cortical areas. For the LO region, the GGCAM score was the highest at 200-250 ms, while the LRP score was the highest at 300-350 ms. Both methods showed the highest values at 300-350 ms for the right SPL region (Fig. 5B).
The heatmaps in Fig. 6 present the distribution of relevance scores for incorrectly classified data during the critical temporal period (200-350 ms). It is remarkable that the interpretations of the 2dCNN and 3dCNN classifiers were inconsistent, with different regions identified as important for the prediction, which clearly differs from the case of correctly classified data (Fig. 5). The heatmaps were also inconsistent between the LRP and GGCAM results.
We investigated the relationship between neural activity in the identified critical spatiotemporal ranges and cognitive function scores. Table 3 presents the results of the correlation analysis. Pearson's correlation analysis showed that the average cortical current density in the right SPL region in the critical temporal range (Fig. 5B) was negatively correlated with the RBDQ-HK score (rho = −0.17, p < 0.05). The average cortical current density in the right SPL region in the critical temporal range was significantly correlated with the MMSE score (rho = 0.26, p < 0.01) for all subjects. For iRBD patients only, the correlation was also significant (rho = 0.31, p < 0.05).
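The correlation analysis reported above can be sketched as follows. The code computes Pearson's coefficient for a pair of sequences and converts a coefficient into the t statistic used for significance testing. The toy values are hypothetical and are not study data, and the sample size of 98 assumes that all 49 patients and 49 controls entered the correlation with MMSE. Converting the t statistic into a one-tailed p-value requires the t-distribution, which is not shown here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r ** 2))


# Hypothetical toy values, not study data.
density = [1.0, 2.0, 3.0, 4.0]
score = [1.0, 3.0, 2.0, 4.0]
r = pearson_r(density, score)

# Reported coefficient for MMSE, assuming n = 98 (49 patients + 49 controls).
t_mmse = t_statistic(0.26, 98)
```

Because the hypotheses were directional (negative for clinical scores, positive for cognitive scores), the one-tailed test places the whole rejection region on the hypothesized side of the t-distribution.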

Discussion
In this study, we showed that the use of 3dCNN is advantageous for characterizing the differentiation of spatiotemporal neural activity between iRBD patients and normal controls, as it does not require any a priori assumptions on the critical location and time. These findings suggest that our 3dCNN-based approach may lead to the identification of useful neuromarkers for brain activity underlying the abnormal cognitive function associated with iRBD.
The interpretation method of the classifier produced a heatmap indicating the contribution of the cortical activity of each localized region in the spatiotemporal domain to the prediction of iRBD patients. We confirmed that the identified spatiotemporal information was correlated with cognitive function scores and consistent with neurophysiological profiles, thus determining it to be a neuromarker reflecting spatiotemporal attention impairment in patients with iRBD.
Conventional statistical techniques often involve comparing averaged single-trial EEGs between groups to identify ERP patterns. However, machine learning techniques can examine characteristic patterns in all single-trial EEGs without averaging them, uncovering subtle patterns that may not be visible through traditional statistical approaches, and preventing loss of information. In this study, we used an explainable machine-learning technique to identify spatiotemporal information that consistently contributes to the prediction of classifiers by averaging individual heatmaps. Future research can explore the variations among patients and trials by analyzing individual heatmaps in greater depth.
Cortical activities in the bilateral LO at 200-350 ms and right SPL at 300-350 ms were found to be critical in discriminating iRBD patients from normal controls. The LO region receives visual inputs in a bottom-up manner and is modulated by top-down attention 56 , thus playing a pivotal role in visuospatial attentional processing triggered by target stimuli 57 . The 200-250 ms period coincides with the latency of the N1 ERP component, which is known to be devoted to early visuospatial processing 23 . Therefore, we estimate that the neural activity of the LO region during this period is devoted to early visuospatial processing for the attentional task and may underlie the differences in neurobehavioral responses between iRBD patients and normal controls.
During the 300-350 ms period, the LO and right SPL regions were found to be critical and are known to reflect higher-order visual processing, which is modulated by top-down control of spatial attention 58,59 . This is also consistent with previous results that showed right hemisphere dominance in visuospatial processing 60 . Our results may be interpreted as reflecting a higher cognitive load for visuospatial processing in iRBD patients than in normal controls.
We identified a significant negative correlation between cortical current density in the right superior parietal lobule (SPL) region and the RBDQ-HK score. Specifically, SPL activity was negatively correlated with the severity of RBD symptoms, suggesting that a decline in SPL activity may be related to an increase in symptom severity. For patients with iRBD, the MMSE score was significantly correlated with SPL activity. Previous studies have shown that the SPL region is critical for spatial working memory and attention 61 , and especially for the spatial memory of cue location and attentional control for target stimuli processing during a visual search task. Thus, it is expected that the decreased SPL activity during the 300-350 ms period underlies the cognitive decline of iRBD patients.
Both methods for the interpretation of the trained classifiers, LRP and GGCAM, yielded similar results in terms of the spatial and temporal locations of critical regions. For the LO region, there was a slight difference in the temporal epochs (LRP: 300-350 ms, GGCAM: 200-250 ms). The LRP results are based on the relative contribution of each node in the input layer to the output, whereas GGCAM scores the positive gradient of the output with respect to the activity of each input node. Thus, we interpret that LO activity during 200-350 ms is critical for the differentiated cognitive function associated with iRBD. The output for the classification was most sensitive to the earlier activity (200-250 ms), which is expected to be devoted to early visual perception, whereas the later activity (300-350 ms), which is expected to underlie visuospatial attention, greatly contributed to determining the classifier output.
Both 2dCNN and 3dCNN provided similar results in that the heatmaps showed similar spatial distributions when the temporal epoch was predetermined based on previous knowledge of cortical activities for visuospatial attentional processing 23,33,39 . The spatial information provided by the method suggested in this study could be interpreted as representing cortical dysfunction for attentional processing associated with iRBD. The spatial characteristics of abnormal cortical activity associated with iRBD identified in the current study are consistent with the metabolic/hemodynamic profiles revealed by functional neuroimaging 42,43 . An FDG-PET study revealed abnormal metabolic network activities in patients with iRBD, characterized by decreased activities in occipital regions, including the lateral occipital region, lingual gyrus, and precuneus, and increased activity in the medial frontal region 62 . In addition, an fMRI study showed altered resting-state thalamocortical functional connectivity associated with cognition in iRBD 63 .
For correct prediction, spatiotemporal features reflecting cognitive impairment of the patients seem to play an important role in the judgement of the classifier, whereas the distribution of critical spatiotemporal features seems to be inconsistent and uninterpretable for incorrect prediction. This is in line with a previous study on diagnosing and interpreting patients with lung disease using chest X-rays 64 . When the patients were correctly classified, disease-related localized areas were identified as important features for judgement, whereas other irrelevant areas were identified as important features for incorrectly classified data. Furthermore, a study utilizing MRI to predict Alzheimer's disease (AD) patients found that the features identified through interpretation methods in correctly predicted cases corresponded with the neuropathology of AD patients. Conversely, the features of incorrectly predicted cases are uninterpretable 65 .
In the case of 3dCNN, the critical temporal epoch was identified solely from the data without any a priori information and nearly coincided with the period assumed for the use of 2dCNN, which was based on previous ERP studies 23,33,39 . We also confirmed that the classification accuracy of the 2dCNN classifier was maximized when the temporal period was selected as that identified by the 3dCNN-based method. We expect that our results will provide a basis for further studies to identify the spatiotemporal characterization of the neural activity underlying abnormal cognitive function associated with various neurological/psychiatric disorders. Considering that the available screening methods for iRBD are rather limited in terms of both sensitivity and specificity (mostly below 85% accuracy) 66 , our methods based on the CNN classifier provide prospective alternative or supplementary tools for the screening of iRBD.
To verify whether the classifier was overly sensitive to small changes in the input data, we investigated the robustness of the classifier to noise by adding different noise levels to the input data. The experimental findings indicated that the proposed CNN classifier was unaffected by changes in the input data (Fig. S3). One way to assess a classifier's generalization performance is to add noise to the input data. This technique learns more resilient features that are less sensitive to minor deviations in input data. However, it is worth noting that excessive noise can impede the classifier's ability to recognize underlying patterns in the data, which may result in poor generalization outcomes. Hence, it is important to choose an appropriate noise level that is similar to the variations that the classifier may face in the clinical field.
For further analysis, we evaluated the generalization performance by cross-validating the model structure and adding different noise levels to the training data. The results confirm that our classifier is robust to noise and structure, resulting in low generalization error. In other words, we can conclude that the trained classifier has learned the underlying patterns and relationships in the data rather than simply memorizing the noise in the training data.

Conclusion
Here, we presented methods to identify the spatiotemporal characteristics of abnormal cortical activities associated with iRBD underlying cognitive dysfunctions, especially during a visuospatial attention task, based on CNN classifiers and an explainable machine learning approach. By finding the important nodes in the input layer that contributed most significantly to the output after successful training of the classifiers, the critical spatiotemporal region could be determined, which is expected to represent the difference between patients with iRBD and normal controls. The 3dCNN-based method is beneficial in that the data-driven approach can be implemented with high accuracy and without any a priori assumptions. Our method may contribute to further studies on the neural underpinnings of abnormal brain activity due to various neuropsychiatric diseases based on a relatively simple procedure using single-trial ERPs, which can be obtained from scalp EEG recordings.

Data availability
The data presented in this study are not publicly available because they contain information that can compromise the privacy of the research participants. Some of the data may be available from the corresponding authors upon request. The code and supplementary materials are available at GitHub: https://github.com/dosteps/iRBD_XML_3dCNN.